Feature Ranking, Selection and Discretization
Abstract
Many indices for evaluating features have been considered. Applied to single features, they allow irrelevant attributes to be filtered out. Algorithms that select subsets of features also remove redundant features. Hashing techniques enable efficient application of feature relevance indices to the selection of feature subsets. A number of such methods have been applied to artificial and real-world data. The strong influence of continuous feature discretization and the very good performance of separability-based discretization have been noted.
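As a rough illustration of the filtering step described above, the following sketch discretizes continuous features and ranks them with a single relevance index. It is only a minimal example under stated assumptions: mutual information stands in for the relevance indices studied in the paper, equal-width binning stands in for the separability-based discretization it reports on, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def equal_width_bins(values, n_bins=4):
    """Map continuous values to integer bin labels (equal-width binning).

    Assumption: a simple stand-in, not separability-based discretization.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # Clamp the maximum value into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def rank_features(columns, labels, n_bins=4):
    """Return (feature name, score) pairs sorted by decreasing relevance."""
    scores = {
        name: mutual_information(equal_width_bins(values, n_bins), labels)
        for name, values in columns.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: one feature separates the classes, one does not.
columns = {
    "informative": [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9],
    "noise": [0.5, 0.1, 0.9, 0.3, 0.5, 0.1, 0.9, 0.3],
}
labels = [0, 0, 0, 0, 1, 1, 1, 1]

ranking = rank_features(columns, labels)
for name, score in ranking:
    print(f"{name}: {score:.3f} bits")
```

Features whose score falls below a chosen threshold could then be filtered out as irrelevant. Ranking single features in this way does not detect redundancy between features, which is why subset selection is treated separately in the abstract.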
Similar resources
Feature Selection and Ranking Filters
Many feature selection and feature ranking methods have been proposed. Using real and artificial data an attempt has been made to compare some of these methods. The "feature relevance index" used seems to have little effect on the relative ranking. For continuous features discretization and kernel smoothing are compared. Selection of subsets of features using hashing techniques is compared with...
A New Hybrid Framework for Filter based Feature Selection using Information Gain and Symmetric Uncertainty (TECHNICAL NOTE)
Feature selection is a pre-processing technique used for eliminating the irrelevant and redundant features which results in enhancing the performance of the classifiers. When a dataset contains more irrelevant and redundant features, it fails to increase the accuracy and also reduces the performance of the classifiers. To avoid them, this paper presents a new hybrid feature selection method usi...
A hybrid filter-based feature selection method via hesitant fuzzy and rough sets concepts
High dimensional microarray datasets are difficult to classify since they have many features with small number of instances and imbalanced distribution of classes. This paper proposes a filter-based feature selection method to improve the classification performance of microarray datasets by selecting the significant features. Combining the concepts of rough sets, weighted rough set, fuzzy rough se...
Mutual information-based feature selection and partition design in fuzzy rule-based classifiers from vague data
Algorithms for preprocessing databases with incomplete and imprecise data are seldom studied. For the most part, we lack numerical tools to quantify the mutual information between fuzzy random variables. Therefore, these algorithms (discretization, instance selection, feature selection, etc.) have to use crisp estimations of the interdependency between continuous variables, whose application to...
Context-Sensitive Attribute Evaluation
The research in machine learning, data mining, and statistics has provided a number of methods that estimate the usefulness of an attribute (feature) for prediction of the target variable. The estimates of attributes’ utility are subsequently used in various important tasks, e.g., feature subset selection, feature weighting, feature ranking, feature construction, data transformation, decision a...